Towards zero-shot cross-lingual named entity disambiguation

Abstract

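• Novel zero-shot cross-lingual Named Entity Disambiguation approach.
• Robust system that does not require native prior probabilities.
• Purpose-built multilingual method outperforms generic models such as XLM-R.
• English is not necessarily the most effective training language for zero-shot.
• New dataset Basque/English, which facilitates further research.

In cross-lingual Named Entity Disambiguation (XNED) the task is to link Named Entity mentions in text in some native language to English entities in a knowledge graph. XNED systems usually require training data for each native language, limiting their application for low resource languages with small amounts of training data. Prior work has proposed so-called zero-shot transfer systems which are only trained on English training data, but required native prior probabilities of entities with respect to mentions, which had to be estimated from native training examples, limiting their practical interest. In this work we present a zero-shot XNED architecture where, instead of a single disambiguation model, we have a model for each possible mention string, thus eliminating the need for native prior probabilities. Our system improves over prior work in XNED datasets in Spanish and Chinese by 32 and 27 points, and matches the systems which do require native prior information. We experiment with different multilingual transfer strategies, showing that better results are obtained with a purpose-built multilingual pre-training method compared to state-of-the-art generic multilingual models such as XLM-R. We also discovered, surprisingly, that English is not necessarily the most effective zero-shot training language for XNED into English. For instance, Spanish is more effective when training a zero-shot XNED system that disambiguates Basque mentions with respect to an English knowledge graph.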
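The central architectural idea in the abstract, one disambiguation model per mention string rather than a single global model, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the bag-of-words scorer stands in for the multilingual encoder the paper relies on, and every name here (`MentionModel`, `train_per_mention`, `disambiguate`, the toy examples and entity identifiers) is a hypothetical chosen for illustration.

```python
from collections import Counter, defaultdict


class MentionModel:
    """Toy classifier for ONE mention string (hypothetical, illustration only).

    It only ever sees, and only ever scores, the candidate entities of its
    own mention string, so no mention-to-entity prior table is consulted.
    """

    def __init__(self):
        # entity id -> counts of context words seen with that entity
        self.word_counts = defaultdict(Counter)

    def train(self, context, entity):
        self.word_counts[entity].update(context.lower().split())

    def predict(self, context):
        words = context.lower().split()
        # Choose the candidate whose training contexts overlap most.
        return max(
            self.word_counts,
            key=lambda e: sum(self.word_counts[e][w] for w in words),
        )


def train_per_mention(examples):
    """Build one MentionModel per distinct mention string."""
    models = defaultdict(MentionModel)
    for mention, context, entity in examples:
        models[mention].train(context, entity)
    return dict(models)


# Toy English training data: (mention string, context, entity id).
examples = [
    ("Paris", "capital city of France", "Paris_(city)"),
    ("Paris", "Trojan prince in Greek mythology", "Paris_(mythology)"),
    ("Jordan", "country in the Middle East", "Jordan_(country)"),
    ("Jordan", "basketball player for the Chicago Bulls", "Michael_Jordan"),
]
models = train_per_mention(examples)


def disambiguate(mention, context):
    """Dispatch to the model owned by this mention string, if any."""
    model = models.get(mention)
    return model.predict(context) if model is not None else None


print(disambiguate("Paris", "the prince of Troy in mythology"))
print(disambiguate("Jordan", "he played basketball in Chicago"))
```

In a cross-lingual setting the word-overlap scorer would have to be replaced by a representation shared across languages, so that models trained on English contexts can score native-language contexts. That part is outside the scope of this sketch.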

Similar resources

Cross-lingual named entity extraction and disambiguation

We propose a method for identifying and disambiguating named entities in a scenario where the language of the input text differs from the language of the knowledge base. We demonstrate this functionality on English and Slovene named entity disambiguation.

Cross-Lingual Named Entity Recognition via Wikification

Named Entity Recognition (NER) models for language L are typically trained using annotated data in that language. We study cross-lingual NER, where a model for NER in L is trained on another, source, language (or multiple source languages). We introduce a language-independent method for NER, building on cross-lingual wikification, a technique that grounds words and phrases in non-English text in...

Label Embedding for Zero-shot Fine-grained Named Entity Typing

Named entity typing is the task of detecting the types of a named entity in context. For instance, given “Eric is giving a presentation”, our goal is to infer that ‘Eric’ is a speaker or a presenter and a person. Existing approaches to named entity typing cannot work with a growing type set and fail to recognize entity mentions of unseen types. In this paper, we present a label embedding metho...

Cross-lingual Transfer of Named Entity Recognizers without Parallel Corpora

We propose an approach to cross-lingual named entity recognition model transfer without the use of parallel corpora. In addition to global de-lexicalized features, we introduce multilingual gazetteers that are generated using graph propagation, and cross-lingual word representation mappings without the use of parallel data. We target the e-commerce domain, which is challenging due to its unstru...

Exploring Entity Relations for Named Entity Disambiguation

Named entity disambiguation is the task of linking an entity mention in a text to the correct real-world referent predefined in a knowledge base, and is a crucial subtask in many areas like information retrieval or topic detection and tracking. Named entity disambiguation is challenging because entity mentions can be ambiguous and an entity can be referenced by different surface forms. We prese...


Journal

Journal title: Expert Systems With Applications

Year: 2021

ISSN: 1873-6793, 0957-4174

DOI: https://doi.org/10.1016/j.eswa.2021.115542